Summary and discussion of: “Why Does Unsupervised Pre-training Help Deep Learning?”
Authors
Abstract
Before getting into how unsupervised pre-training improves the performance of deep architectures, let us first review some basics, starting with logistic regression, one of the first classification models taught in machine learning. Logistic regression addresses the supervised learning problem of learning a mapping F : X → Y given a set of training points X = {x1, . . . , xn} and a set of class labels Y = {y1, . . . , yn}, where each xi is assigned a class label yi. The mapping is defined by the function p(Y = 1|X) = 1 / (1 + exp(−(W^T X + b))). There is another way of looking at the logistic classifier. One can think of X as the input to a node in a graphical model, where the node does two things: it sums the inputs multiplied by the weights of the edges, and then applies a sigmoid to the result. A diagram representing such a function is shown in Figure 1. The node that performs the summation and the non-linear transformation is called a neuron. The summation is called the input activation, a(x) = W^T X + b, and the non-linear transform, h(x) = g(a(x)), is called the output activation of the neuron. Let's take a look at an example in Figure 2. An AND function can clearly be modelled using a single neuron, but a XOR function cannot be directly modelled this way, because XOR is not linearly separable. If
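The AND case can be sketched with a single sigmoid neuron as defined above. The weights and bias below are hand-picked for illustration (they are an assumption, not values from the paper); any choice that makes the input activation positive only for the input (1, 1) works.

```python
import numpy as np

def sigmoid(a):
    """Output activation g(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def neuron(x, W, b):
    """Single neuron: input activation a(x) = W^T x + b, then sigmoid."""
    return sigmoid(x @ W + b)

# Truth-table inputs for two binary features.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hand-picked weights (illustrative): the input activation is positive
# only when both inputs are 1, so thresholding the output at 0.5
# reproduces the AND function.
W_and = np.array([10.0, 10.0])
b_and = -15.0

print(neuron(X, W_and, b_and) > 0.5)  # [False False False  True], i.e. AND
```

By contrast, no single (W, b) can reproduce XOR this way: one neuron draws a single linear decision boundary, and no line separates {(0, 1), (1, 0)} from {(0, 0), (1, 1)}.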
Similar resources
Why Does Unsupervised Pre-training Help Deep Learning?
Much recent research has been devoted to learning algorithms for deep architectures such as Deep Belief Networks and stacks of auto-encoder variants, with impressive results obtained in several areas, mostly on vision and language data sets. The best results obtained on supervised learning tasks involve an unsupervised learning component, usually in an unsupervised pre-training phase. Even thou...
Unsupervised Feature Learning With Symmetrically Connected Convolutional Denoising Auto-encoders
Unsupervised pre-training was a critical technique for training deep neural networks years ago. With sufficient labeled data and modern training techniques, it is possible to train very deep neural networks from scratch in a purely supervised manner nowadays. However, unlabeled data is easier to obtain and usually of very large scale. How to make use of them better to help supervised learning i...
Deep Learning of Representations for Unsupervised and Transfer Learning
Deep learning algorithms seek to exploit the unknown structure in the input distribution in order to discover good representations, often at multiple levels, with higher-level learned features defined in terms of lower-level features. The objective is to make these higher-level representations more abstract, with their individual features more invariant to most of the variations that are typical...
Unsupervised Pre-training With Seq2Seq Reconstruction Loss for Deep Relation Extraction Models
Relation extraction models based on deep learning have been attracting a lot of attention recently. Little research has been carried out to reduce their need for labeled training data. In this work, we propose an unsupervised pre-training method based on the sequence-to-sequence model for deep relation extraction models. The pre-trained models need only half or even less training data to achieve equiv...
Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning
Domain adaptation is a powerful technique given a large amount of labeled data with similar attributes across different domains. In real-world applications there is a huge amount of data, but most of it is unlabeled. Domain adaptation is effective in image classification, where it is expensive and time-consuming to obtain adequately labeled data. We propose a novel method named DALRRL, which consists of deep ...